Minimizing communication overhead for matrix inversion algorithms on hypercubes

Authors

  • Xiaodong Wang
  • Vwani P. Roychowdhury
Abstract

The main contribution of this report is the development of novel algorithms that make efficient use of the communication system in distributed memory architectures with processing elements interconnected by a hypercube network. These algorithms achieve almost optimal overlap of communication delays by computation, leading to a minimization of communication overhead. Rigorous analytical and numerical performance analyses of our parallel algorithms are presented as well. The parallel algorithm under study in this report is the parallel Gauss-Jordan (GJ) matrix inversion algorithm. Many of the ideas introduced in this report, however, apply to other linear algebra algorithms as well. Parallel Gauss-Jordan matrix inversion algorithms on hypercube multiprocessors have been extensively studied in the literature. Two common data partitioning strategies for matrix algorithms are row-wise partitioning and submatrix partitioning. It has been claimed that, for the parallel GJ inversion algorithm, the submatrix partitioning scheme exhibits communication overhead advantages not shared by partitions limited to rows or columns. Most parallel algorithms proposed in the literature, however, do not attempt to overlap inter-processor communication by computation. As a result, the formula execution time = computation time + communication time is used to analyze the complexity of the parallel algorithm. However, during most of the communication time, the processors are actually idle, waiting for the data to arrive. Most commercially available parallel machines provide communication interrupt handling capability. By utilizing this feature, we believe many parallel matrix algorithms can be improved by overlapping interprocessor communication and computation. In this report we propose and analyze new parallel GJ inversion algorithms under different data partitioning strategies, with or without partial pivoting.
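For readers unfamiliar with the underlying method, the following is a minimal serial sketch of Gauss-Jordan inversion with partial pivoting. It is not the parallel algorithm of the report; the function name `gauss_jordan_inverse` and the tolerance `eps` are assumptions made for illustration. The comments mark the step that a row-partitioned parallel version would presumably have to communicate.

```python
def gauss_jordan_inverse(a, eps=1e-12):
    """Invert a square matrix (given as a list of rows) by Gauss-Jordan
    elimination with partial pivoting. Serial reference sketch only."""
    n = len(a)
    # Build the augmented matrix [A | I].
    m = [
        [float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(a)
    ]
    for k in range(n):
        # Partial pivoting: pick the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        if abs(m[p][k]) < eps:
            raise ValueError("matrix is singular to working precision")
        m[k], m[p] = m[p], m[k]
        # Normalize the pivot row. With row partitioning, this is the row
        # every other processor needs before it can do its own elimination.
        pivot = m[k][k]
        m[k] = [x / pivot for x in m[k]]
        # Eliminate column k from every other row (above and below the pivot).
        for i in range(n):
            if i != k:
                f = m[i][k]
                if f != 0.0:
                    m[i] = [x - f * y for x, y in zip(m[i], m[k])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in m]


if __name__ == "__main__":
    for row in gauss_jordan_inverse([[4, 7], [2, 6]]):
        print(row)
```

Each of the n elimination steps depends on one pivot row, which is why the time at which that row reaches the other processors governs how much communication can be hidden behind computation.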
We first propose a new broadcasting algorithm on the hypercube multiprocessor for the parallel GJ algorithm. This algorithm ensures that the data are sent out from the source and arrive at the destinations at the earliest possible time. We then give the parallel GJ inversion algorithm using row partitioning. The strategy to overlap communication by computation is for each processor to compute and send out the data needed by the other processors as early as possible. We prove a lower bound on the matrix size such that data transmission is fully overlapped by computation. We also prove that the message length in the input buffer of each processor is at most 2. We also consider the algorithms under submatrix partitioning, with or without pivoting. We show that when submatrix partitioning is used, even when the communication is fully overlapped by computation, the communication overhead is larger than when using row partitioning. Thus, we show that by minimizing communication overhead, the row partitioning scheme can indeed have better overall performance than the submatrix partitioning scheme. Finally, we extend the idea of overlapping communication and computation to the parallel LU factorization algorithm.
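The abstract does not spell out the new broadcasting algorithm, so the sketch below shows only the standard textbook one-to-all broadcast on a hypercube (a binomial tree that doubles the set of informed nodes in each step), as background for what such an algorithm improves on. The function name `hypercube_broadcast_schedule` and its parameters are hypothetical.

```python
def hypercube_broadcast_schedule(dim, source=0):
    """Return the send schedule of a textbook one-to-all broadcast on a
    hypercube with 2**dim nodes. Step i is a list of (sender, receiver)
    pairs; every informed node forwards across dimension i.
    This is NOT the report's algorithm, only the standard baseline."""
    have = {source}          # nodes that already hold the message
    steps = []
    for bit in range(dim):
        # Neighbors across dimension `bit` differ in exactly that address bit.
        sends = [(node, node ^ (1 << bit)) for node in sorted(have)]
        have.update(dst for _, dst in sends)
        steps.append(sends)
    return steps


if __name__ == "__main__":
    for i, sends in enumerate(hypercube_broadcast_schedule(3)):
        print(f"step {i}: {sends}")
```

With `dim` address bits, this baseline reaches all 2**dim nodes in `dim` steps, and every message travels over a single hypercube link.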

Similar resources

A Direct Matrix Inversion-Less Analysis for Distribution System Power Flow Considering Distributed Generation

This paper presents a new direct matrix inversion-less analysis for radial distribution systems (RDSs). The method can successfully deal with weakly meshed distribution systems (WMDSs). Being easy to implement, direct methods (DMs) provide excellent performance. Matrix inversion is the main reason for divergence and low efficiency in power flow algorithms. In this paper, the performance of t...

Overlapping Communication and Computation in Hypercubes

This paper presents a method to derive efficient algorithms for hypercubes. The method exploits two features of the underlying hardware: a) the parallelism provided by the multiple communication links of each node and b) the possibility of overlapping computations and communications, which is a feature of machines supporting an asynchronous communication protocol. The method can be applied to a...

Minimizing Overhead in Parallel Algorithms through Overlapping Communication/computation

One of the major goals in the design of parallel processing machines and algorithms is to reduce the effects of the overhead introduced when a given problem is parallelized. A key contributor to overhead is communication time. Many architectures try to reduce this overhead by minimizing the actual time for communication, including latency and bandwidth. Another approach is to hide communication...

Optimal Processor Mapping for Linear-Complement Communication on Hypercubes

In this paper, we address the problem of minimizing channel contention of linear-complement communication on wormhole-routed hypercubes. Our research reveals that, for traditional routing algorithms, the degree of channel contention of a linear-complement communication can be quite large. To solve this problem, we propose an alternative approach, which applies processor reordering mapping at com...

Fault-Tolerant Sorting Algorithm on

In this paper, algorithmic fault-tolerant techniques are introduced for sorting algorithms on n-dimensional hypercube multicomputers. We propose a fault-tolerant sorting algorithm that can tolerate up to n - 1 faulty processors. First, we indicate that the bitonic sorting algorithm can perform sorting operations correctly on hypercubes with one faulty processor. In order to tolerate up to r ≤ n - 1 fau...

Publication date: 1995